Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Catalan Chinese English Esperanto French German Italian Kabyle Kinyarwanda Persian Polish Russian Spanish Welsh
Availability:
Freely Available
License:
Creative Commons license
Size:
8.8k hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
-
Paper title:LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
-
Paper track:8.1 Feature extraction and low-level feature model/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Laurent Besacier | Common Voice | /N |
Documentation:
https://arxiv.org/pdf/1912.06670.pdf, English, public
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution 4.0 License
Size:
55.5 hours
Production Status:
Existing-used
Use:
Voice Control
-
Paper title:PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification
-
Paper track:14.16 Privacy-preserving Machine Learning for Audi/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chao-Han Huck Yang | Google Speech Commands Dataset Version 2 | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
LDC
Size:
686 MByte
Production Status:
Existing-used
Use:
Phoneme detection algorithm training
-
Paper title:Fricative Phoneme Detection Using Deep Neural Networks and its Comparison to Traditional Methods
-
Paper track:5.7 Detection, inference, and segmentation of phon/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Metehan Yurt | TIMIT Acoustic-Phonetic Continuous Speech Corpus | /N |
Documentation:
Documentation is in English and is publicly available.
Speech
Corpus,
Language Type:
Multilingual
Languages:
Arabic English French
Availability:
Freely Available
License:
Creative Commons Attribution-NonCommercial-NoDerivs 3.0 license
Size:
18.3 MByte
Production Status:
Existing-used
Use:
word classification
-
Paper title:Token-Level Supervised Contrastive Learning for Punctuation Restoration
-
Paper track:9.3 Language modelling/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Qiushi Huang | International Workshop on Spoken Language Translation | /N |
Documentation:
http://hltc.cs.ust.hk/iwslt/index.php/evaluation-campaign/ted-task.html
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
LDC
Size:
5.4 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
-
Paper title:Online Blind Audio Source Separation Using Recursive Expectation-Maximization
-
Paper track:5.8 Source separation/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sharon Gannot | TIMIT Acoustic-Phonetic Continuous Speech Corpus | /N |
Documentation:
https://catalog.ldc.upenn.edu/docs/LDC93S1/
Speech
Corpus,
Language Type:
Bilingual
Languages:
English Hebrew
Availability:
Freely Available
License:
Size:
100 MByte
Production Status:
Newly created-finished
Use:
Machine Learning
-
Paper title:An Agent for Competing with Humans in a Deceptive Game Based on Vocal Cues
-
Paper track:5.12 Other topics in Analysis of Speech and Audio /Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Amos Azaria | Cheat Game - Vocal Statments | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
OpenSource
Size:
6.3 GByte
Production Status:
Existing-used
Use:
-
Paper title:End-to-end Optimized Multi-stage Vector Quantization of Spectral Envelopes for Speech and Audio Coding
-
Paper track:6.1 Speech coding and transmission/Poster Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mohammad Hassan Vali | LibriSpeech ASR corpus | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
We are working on making this dataset available.
License:
Size:
20 hours
Production Status:
Newly created-in progress
Use:
Evaluation of dereverberation algorithms
-
Paper title:Scene-Agnostic Multi-Microphone Speech Dereverberation
-
Paper track:6.7 Dereverberation for speech signals/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sharon Gannot | BIUREV and BIUREV-N | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons License: Attribution 4.0 International
Size:
10 GByte
Production Status:
Existing-used
Use:
Speech Synthesis
-
Paper title:Cross-lingual Speaker Adaptation using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis
-
Paper track:7.11 Cross-lingual and multilingual aspects in spe/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Detai Xin | CSTR VCTK Corpus | /N |
Documentation:
None
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
68 minutes
Production Status:
Existing-used
Use:
Machine Learning
-
Paper title:Using Transposed Convolution for Articulatory-to-Acoustic Conversion from Real-Time MRI Data
-
Paper track:1.1 Models of speech production/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ryo Tanji | USC-TIMIT | /N |
Documentation:
[6] S. Narayanan, A. Toutios, V. Ramanarayanan, A. Lammert, J. Kim, S. Lee, K. Nayak, Y.-C. Kim, Y. Zhu, L. Goldstein, D. Byrd, E. Bresch, P. Ghosh, A. Katsamanis, and M. Proctor, “Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC),” The Journal of the Acoustical Society of America, vol. 136, no. 3, pp. 1307–1311, Sep 2014.




